Fast XML Structural Join Algorithms by Partitioning
Authors
Abstract
An XML structural join evaluates structural relationships (e.g. parent-child or ancestor-descendant) between XML elements. It serves as an important computation unit in XML pattern matching. Several classical structural join algorithms have been proposed, such as Stack-tree join and XR-Tree join. In this paper, we address the structural join problem by partitioning. The Dietz numbering scheme is used for encoding because nodes labeled with Dietz encodings can be distributed well on a plane. We first extend the relationships between nodes to relationships between partitions on a plane and derive observations and properties about these partition relationships. Based on these properties, we then propose a new partition-based method, named P-Join, for structural joins between ancestor and descendant nodes. Moreover, we present an enhanced partition-based structural join algorithm and two optimized methods. Extensive experiments show that our proposed algorithms outperform the Stack-tree and XR-Tree algorithms. To store the partitioning results, we design a simple but efficient index structure, called PSS-tree. Experimental results show that it has lower maintenance overhead than XR-Tree.
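As a minimal illustration of the encoding the abstract mentions, the sketch below assigns Dietz-style (preorder, postorder) labels to a small tree and runs a naive nested-loop ancestor-descendant join. It relies on the standard property that a node a is an ancestor of a node d exactly when pre(a) < pre(d) and post(a) > post(d), so on a (pre, post) plane the descendants of a fall in the region to its lower right. The names `Node`, `assign_dietz_labels`, `is_ancestor`, and `naive_structural_join` are hypothetical and chosen for this example; the code is a quadratic baseline for clarity, not the paper's P-Join algorithm or its PSS-tree index.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical XML element with Dietz-style (pre, post) labels."""
    tag: str
    children: list["Node"] = field(default_factory=list)
    pre: int = 0   # rank in a preorder traversal
    post: int = 0  # rank in a postorder traversal


def assign_dietz_labels(root: Node) -> None:
    """Label every node with its preorder and postorder rank (1-based)."""
    pre_counter = 0
    post_counter = 0

    def visit(node: Node) -> None:
        nonlocal pre_counter, post_counter
        pre_counter += 1
        node.pre = pre_counter          # numbered on entry
        for child in node.children:
            visit(child)
        post_counter += 1
        node.post = post_counter        # numbered on exit

    visit(root)


def is_ancestor(a: Node, d: Node) -> bool:
    """a is an ancestor of d iff a starts before d and finishes after d."""
    return a.pre < d.pre and a.post > d.post


def naive_structural_join(ancestors: list[Node], descendants: list[Node]):
    """Quadratic baseline: test every (ancestor, descendant) candidate pair."""
    return [(a, d) for a in ancestors for d in descendants if is_ancestor(a, d)]
```

A partition-based method would aim to avoid most of these pairwise tests by comparing whole regions of the (pre, post) plane first; how P-Join does so is described in the paper itself and is not reproduced here.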
Similar resources
Accelerating XML Structural Join by Partitioning
Structural join is the core part of XML queries and has a significant impact on their performance. Several classical structural join algorithms have been proposed, such as Stack-tree join and XR-Tree join. In this paper, we address the structural join problem by partitioning. We first extend the relationships between nodes to the relationships between partitions in the...
Structural Joins: A Primitive for Efficient XML Query Pattern Matching
XML queries typically specify patterns of selection predicates on multiple elements that have some specified tree structured relationships. The primitive tree structured relationships are parent-child and ancestor-descendant, and finding all occurrences of these structural relationships in an XML database is a core operation for XML query processing. In this paper, we develop two families of struc...
Fast and Tiny Structural Self-Indexes for XML
XML document markup is highly repetitive and therefore well compressible using dictionary-based methods such as DAGs or grammars. In the context of selectivity estimation, grammar-compressed trees were used before as synopses for structural XPath queries. Here a fully-fledged index over such grammars is presented. The index allows arbitrary tree algorithms to be executed with a slow-down that is co...
Labeling Scheme and Structural Joins for Graph-Structured XML Data
When XML documents are modeled as graphs, many challenging research issues arise. In particular, query processing for graph-structured XML data brings new challenges because traditional structural join methods cannot be directly applied. In this paper, we propose a labeling scheme for graph-structured XML data. With this labeling scheme, the reachability relationship of two nodes can be judged e...
Amoeba Join: Overcoming Structural Fluctuations in XML Data
There are no universal rules for organizing data in XML. Consequently, semantically identical XML documents may have different structures; we call this structural fluctuation in XML. Finding all the structural fluctuations in an XML document requires verbose path expression queries. To overcome this problem, we developed a novel query processing primitive, called amoeba join. Amoeba join does n...
Journal:
- Journal of Research and Practice in Information Technology
Volume 40, Issue
Pages -
Publication date: 2008